parallel streaming wasserstein barycenter
Parallel Streaming Wasserstein Barycenters
Efficiently aggregating data from different sources is a challenging problem, particularly when samples from each source are distributed differently. These differences can be inherent to the inference task or present for other reasons: sensors in a sensor network may be placed far apart, affecting their individual measurements. Conversely, it is computationally advantageous to split Bayesian inference tasks across subsets of data, but data need not be identically distributed across subsets. One principled way to fuse probability distributions is via the lens of optimal transport: the Wasserstein barycenter is a single distribution that summarizes a collection of input measures while respecting their geometry. However, computing the barycenter scales poorly and requires discretization of all input distributions and the barycenter itself.
- Information Technology > Artificial Intelligence (0.42)
- Information Technology > Communications > Networks (0.39)
Reviews: Parallel Streaming Wasserstein Barycenters
Title: Parallel Streaming Wasserstein Barycenters Comments: - This paper presents a new method for performing low-communication parallel inference via computing the Wasserstein barycenter of a set of distributions. Unlike previous work, this method aims to reduce certain approximations incurred by discretization. Theoretically, this paper gives results involving the rate of the convergence of the barycenter distance. Empirically, this paper shows results on a synthetic task involving a Von Mises distribution and on a logistic regression task. It would be better to clearly (and near the beginning of the paper) give an intuition behind the methodology improvements that this paper aims to provide, relative to previous work on computing the Wasserstein barycenter.
Parallel Streaming Wasserstein Barycenters
Staib, Matthew, Claici, Sebastian, Solomon, Justin M., Jegelka, Stefanie
Efficiently aggregating data from different sources is a challenging problem, particularly when samples from each source are distributed differently. These differences can be inherent to the inference task or present for other reasons: sensors in a sensor network may be placed far apart, affecting their individual measurements. Conversely, it is computationally advantageous to split Bayesian inference tasks across subsets of data, but data need not be identically distributed across subsets. One principled way to fuse probability distributions is via the lens of optimal transport: the Wasserstein barycenter is a single distribution that summarizes a collection of input measures while respecting their geometry. However, computing the barycenter scales poorly and requires discretization of all input distributions and the barycenter itself.
- Information Technology > Artificial Intelligence (0.49)
- Information Technology > Communications > Networks (0.41)